Genome-wide identification and classification of alternative splicing based on EST data
Abstract
MOTIVATION: Alternative splicing is currently thought to explain the vast disparity between the number of predicted genes in the human genome and the highly diverse proteome. Mapping expressed sequence tag (EST) consensus sequences derived from the GeneNest database onto the genome provides an efficient way of predicting exon-intron boundaries, gene structure and alternative splicing events. However, the alternative splicing events are obscured by a large number of putatively artificial exon boundaries arising from genomic contamination or alignment errors. The current work describes a methodology for assigning quality values to the predicted exon-intron boundaries. High-quality exon-intron boundaries are used to predict constitutive and alternative splicing events ranked by confidence values, with the aim of facilitating large-scale analysis of alternative splicing and of splicing in general.

RESULTS: Applying the current methodology, constitutive splicing is observed in 33,270 EST clusters, of which 45% are alternatively spliced. The classification derived from the computed confidence values for 17 of these splice events agrees in most cases (15/17) with RT-PCR experiments performed on 40 different tissue samples. As an application of the confidence measure, an evaluation of the distribution of alternative splicing revealed that the majority of variants correspond to the coding regions of the genes. However, a significant fraction still maps to non-coding regions, indicating a functional relevance of alternative splicing in untranslated regions.

AVAILABILITY: The predicted alternative splice variants are visualized in the SpliceNest database at http://splicenest.molgen.mpg.de
Related resources
Role of Aberrant Alternative Splicing in Cancer
Alternative splicing can alter genome sequence and, as a consequence, many genes change to oncogenes. This event can also affect protein function and diversity. A growing number of studies elucidate the pathological influence of impaired alternative splicing events on numerous diseases, including cancer. Here, we would like to highlight the significant role of alternative splicing in cancer biolog...
The multiassembly problem: reconstructing multiple transcript isoforms from EST fragment mixtures.
Recent evidence of abundant transcript variation (e.g., alternative splicing, alternative initiation, alternative polyadenylation) in complex genomes indicates that cataloging the complete set of transcripts from an organism is an important project. One challenge is the fact that most high-throughput experimental methods for characterizing transcripts (such as EST sequencing) give highly detail...
AVATAR: A database for genome-wide alternative splicing event detection using large scale ESTs and mRNAs
In recent years, identification of alternative splicing (AS) variants has been gaining momentum. We developed AVATAR, a database for documenting AS using 5,469,433 human EST sequences and 26,159 human mRNA sequences. AVATAR contains 12,000 alternative splicing sites identified by mapping ESTs and mRNAs to the whole human genome sequence. AVATAR also contains AS information for 6 e...
An Algorithm for Classification of Alternative Splicing and Transcriptional Initiation and Its Genome-Wide Application
We developed an algorithm that classifies all observed units of alternative splicing and transcriptional initiation and termination (UASTs) into an extendable set of distinct elementary patterns, when a collection of alignments between genomic DNA sequences and a set of cDNA/EST sequences is provided. The algorithm first converts aligned exon-intron structures into bit arrays, extracts UASTs, ...
Transcriptome and Genome Conservation of Alternative Splicing Events in Humans and Mice
Combining mRNA and EST data in splicing graphs with whole genome alignments, we discover alternative splicing events that are conserved in both human and mouse transcriptomes. 1,964 of 19,156 (10%) loci examined contain one or more such alternative splicing events, with 2,698 total events. These events represent a lower bound on the amount of alternative splicing in the human genome. Also, as t...
Journal title:
- Bioinformatics
Volume 20, Issue 16
Pages -
Publication date: 2004